Graph-Modeled Data Clustering
Authors
Abstract
ly speaking, what have we done in the previous section? After applying a number of rules in polynomial time to an instance of VERTEX COVER, we arrived at a reduced instance whose size can be expressed solely in terms of the parameter k. Since this can easily be done in O(n) time, we have found a data reduction for VERTEX COVER with guarantees concerning its running time as well as its effectiveness. These properties are formalized in the concepts of a problem kernel and the corresponding kernelization [25].
Definition 1.2. Let L be a parameterized problem, that is, L consists of input pairs (I, k), where I is the problem instance and k is the parameter. A reduction to a problem kernel (or kernelization) means to replace an instance (I, k) by a reduced instance (I′, k′), called the problem kernel, in polynomial time such that (1) k′ ≤ k, (2) the size of I′ is smaller than g(k) for some function g depending only on k, and (3) (I, k) has a solution if and only if (I′, k′) has one.
While this definition does not formally require that it is possible to reconstruct a solution for the original instance from a solution for the problem kernel, all kernelizations we are aware of easily allow for this. The methodological approach of kernelization, including various techniques of data reduction, is best learned from the concrete examples that we discuss in Sec. 1.3; there, we will also discuss kernelizations for VERTEX COVER that even yield a kernel with a number of vertices linear in k.
To conclude this section, we state some useful general observations and remarks concerning Definition 1.2 and its connections to fixed-parameter tractability. Most notably, there is a close connection between fixed-parameter tractable problems and those problems that have a problem kernel: they are exactly the same.
Theorem 1.1 (Cai et al. [11]). Every fixed-parameter tractable problem is kernelizable and vice versa.
Unfortunately, the practical use of this theorem is limited: the running time of a fixed-parameter algorithm directly obtained from a kernelization is usually not practical, and, in the other direction, the theorem does not constructively provide us with a data reduction scheme for a fixed-parameter tractable problem. Hence, the main use of Theorem 1.1 is to establish the fixed-parameter tractability or amenability to kernelization of a problem, or show that we need not search any
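As a minimal sketch of Definition 1.2 and of the "kernelizable implies fixed-parameter tractable" direction of Theorem 1.1, the Python code below kernelizes VERTEX COVER and then solves the kernel by brute force. The reduction rules of the previous section are not reproduced in this excerpt, so the sketch assumes the classic high-degree rule (a vertex with more than k neighbors must be in every cover of size at most k) together with the resulting bound of at most k² edges. The function names `buss_kernel`, `has_cover`, and `solve_vertex_cover` are hypothetical, the input is assumed to be a simple graph given as a list of edges, and the code makes no attempt to reach the O(n) running time mentioned above.

```python
from itertools import combinations


def buss_kernel(edges, k):
    """Reduce a VERTEX COVER instance (edges, k) to a problem kernel.

    Assumes a simple graph (no self-loops). Isolated vertices never
    appear because the graph is stored as an edge set.
    Returns (kernel_edges, k_prime, forced) or None if no vertex cover
    of size at most k exists.
    """
    edges = {frozenset(e) for e in edges}
    forced = set()  # vertices that every small cover must contain
    while True:
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        high = next((v for v, d in degree.items() if d > k), None)
        if high is None:
            break
        if k == 0:
            return None  # an edge remains but the budget is used up
        # High-degree rule: take the vertex, delete its edges, lower k.
        forced.add(high)
        edges = {e for e in edges if high not in e}
        k -= 1
    # Every remaining vertex covers at most k edges.
    if len(edges) > k * k:
        return None
    return edges, k, forced


def has_cover(edges, k):
    """Brute-force test for a vertex cover of size at most k."""
    vertices = list({v for e in edges for v in e})
    for size in range(min(k, len(vertices)) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            if all(e & chosen for e in edges):
                return True
    return False


def solve_vertex_cover(edges, k):
    """Decide VERTEX COVER by kernelizing first, then brute force."""
    kernel = buss_kernel(edges, k)
    if kernel is None:
        return False
    kernel_edges, k_prime, _forced = kernel
    return has_cover(kernel_edges, k_prime)
```

The kernel satisfies the three conditions of Definition 1.2 under the stated assumptions: k′ ≤ k, the kernel has at most k² edges, and the answer is preserved. Because the brute-force step only ever sees the kernel, its running time depends on k alone, which is also why an algorithm obtained this way is usually too slow in practice.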
Similar resources
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with a huge volume of data, analyzing it is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but we have no way to select the number of clusters and the number of members ...
Fixed-Parameter Algorithms for Graph-Modeled Data Clustering
We survey some practical techniques for designing fixed-parameter algorithms for NP-hard graph-modeled data clustering problems. Such clustering problems ask to modify a given graph into a union of dense subgraphs. In particular, we discuss (polynomial-time) kernelizations and depth-bounded search trees and provide concrete applications of these techniques. After that, we shortly review the use ...
Finding Community Base on Web Graph Clustering
Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities in different fields, most answers to users' questions aren't correct. So web graph clustering and cluster placement in corresponding answers help users achieve their intended results. Community (web communit...
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...
Publication date: 2007